Abstract

We investigate joint graph inference for the chemical and electrical connectomes of
the Caenorhabditis elegans roundworm. The C. elegans connectomes consist of 253 non-isolated
neurons with known functional attributes, and there are two types of synaptic
connectomes, resulting in a pair of graphs. We formulate our joint graph inference from
the perspectives of seeded graph matching and joint vertex classification. Our results suggest
that connectomic inference should proceed in the joint space of the two connectomes,
which has significant neuroscientific implications.

Caenorhabditis elegans (C. elegans) is a non-parasitic, transparent roundworm approximately
one millimeter in length. The majority of C. elegans are female hermaphrodites. Maupas
[1901] first described the worm in 1900 and named it Rhabditis elegans. It was later categorized
under the subgenus Caenorhabditis by Osche [1952], and then, in 1955, raised to generic
status by Ellsworth Dougherty, to whom much of the recognition for choosing C. elegans as
a model system in genetics is attributed [Riddle et al., 1997]. The long name of this nematode
mixes Greek and Latin, where Caeno means recent, rhabditis means rod-like, and elegans
means elegant.

Research on C. elegans rose to prominence after the nematode was adopted as a model
organism: an easy-to-maintain, widely studied non-human species, so that discoveries on this
model organism might offer insights into the functionality of other organisms. The discoveries
of caspases [Yuan et al., 1993], RNA interference [Fire et al., 1998], and microRNAs [Lee et al.,
1993] are among the notable results of research using C. elegans.
Connectomes, the mappings of neural connections within the nervous system of an organism,
provide a comprehensive structural characterization of the neural network architecture, and
represent an essential foundation for basic neurobiological research. Applications based on the
discovery of connectome patterns and the identification of neurons from their connectivity
structure pose significant challenges and promise important impact on neurobiology.
Recently there has been increasing interest in the network properties of the C. elegans
connectomes. The hermaphrodite C. elegans is the only organism with a fully constructed connectome
[Sulston et al., 1983], and has one of the most highly studied nervous systems.

Studies on the C. elegans connectomes traditionally focus on utilizing a single connectome
alone [Varshney et al., 2011, Pavlovic et al., 2014, Sulston et al., 1983, Towlson et al.,
2013], although there are many connectomes available. Notably, Varshney et al. [2011]
discovered structural properties of the C. elegans connectomes by analyzing the connectomes'
graph statistics. Pavlovic et al. [2014] estimated the community structure of the connectomes,
and their findings are compatible with known biological information on the C. elegans nervous
system.

Our new statistical approach of joint graph inference looks instead at jointly utilizing the
paired chemical and electrical connectomes of the hermaphrodite C. elegans. We formulate our
inference framework from the perspectives of seeded graph matching and joint vertex
classification, which we explain in Section 2. This framework gives a way to examine the structural
similarity preserved across multiple connectomes within a species, and to make quantitative
comparisons between joint connectome analysis and single connectome analysis. We found that
the optimal inference for the information-processing properties of the connectome should
proceed in the joint space of the C. elegans connectomes, and that using the joint connectomes
predicts neuron attributes more accurately than using either connectome alone.

1 The Hermaphrodite C. elegans Connectomes

The hermaphrodite C. elegans connectomes consist of 302 labeled neurons for each organism.
The C. elegans somatic nervous system has 279 neurons connecting to each other across
synapses. There are many possible classifications of synaptic types. Here we consider two
types of synaptic connections among these neurons: chemical synapses and electrical junction
potentials. These two types of connectivity result in two synaptic connectomes consisting of
the same set of neurons.

We represent the connectomes as graphs. A graph is a representation of a collection of
interacting objects. The objects are referred to as nodes or vertices. The interactions are
referred to as edges or links. In a connectome, the vertices represent neurons, and the edges
represent synapses. Mathematically, a graph $G = (V, E)$ consists of a set of vertices or nodes
$V = [n] := \{1, 2, \dots, n\}$ and a set of edges $E$. If the graph is undirected and non-loopy, then
the edge set $E \subseteq \binom{V}{2} = \binom{[n]}{2}$, where $\binom{V}{2}$ denotes all
(unordered) pairs $\{v_i, v_j\}$ for all $v_i, v_j \in V$ and $v_i \neq v_j$. If the graph is directed
and non-loopy, there are $n(n-1)$ possible edges, and we can represent them as ordered pairs.
In this work, we assume the graphs are undirected, unweighted, and non-loopy. The adjacency
matrix $A$ of $G$ is the $n$-by-$n$ matrix in which each entry $A_{ij}$ denotes the edge existence
between vertices $i$ and $j$: $A_{ij} = 1$ if an edge is present, and $A_{ij} = 0$ if an edge is
absent. The adjacency matrix is symmetric, binary and hollow, i.e., the diagonal entries are all
zeros.

For the hermaphrodite C. elegans worm, the chemical connectome $G_c$ is weighted and
directed. The electrical gap junctional connectome $G_g$ is weighted and undirected. This is
consistent with an important characteristic of electrical synapses: they are bidirectional [Purves
et al., 2001]. The chemical connectome $G_c$ has 3 loops and no isolated vertices, while the
electrical gap junctional connectome $G_g$ has no loops and 26 isolated vertices. Both connectomes
are sparse. The chemical connectome $G_c$ has 2194 directed edges out of $279 \cdot 278$ possible
ordered neuron pairs, resulting in a sparsity level of approximately 2.8%. The electrical gap
junctional connectome $G_g$ has 514 undirected edges out of $\binom{279}{2}$ possible unordered
neuron pairs, resulting in a sparsity level of approximately 1.3%.
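
The two sparsity levels follow directly from the edge counts quoted above. As a quick check, the arithmetic can be sketched as follows; the variable names are ours and are not part of the dataset.

```python
from math import comb

n = 279              # neurons in the somatic nervous system
chem_edges = 2194    # directed edges in the chemical connectome
gap_edges = 514      # undirected edges in the gap junctional connectome

# Directed, non-loopy graph: n * (n - 1) possible ordered pairs.
chem_density = chem_edges / (n * (n - 1))
# Undirected, non-loopy graph: C(n, 2) possible unordered pairs.
gap_density = gap_edges / comb(n, 2)

print(f"chemical: {chem_density:.1%}, electrical: {gap_density:.1%}")
# prints "chemical: 2.8%, electrical: 1.3%"
```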

In our analysis, we are interested in the $279 - 26 = 253$ non-isolated neurons in the
hermaphrodite C. elegans somatic nervous system. Each of these 253 neurons can be classified
in a number of ways, including into 3 non-overlapping connectivity-based types: sensory
neurons (27.96%), interneurons (29.75%) and motor neurons (42.29%). Here we will work with
binary, symmetric and hollow adjacency matrices of the neural connectomes throughout. We
symmetrize $A$ by $A \leftarrow A + A^\top$, then binarize $A$ by setting its positive entries to
1 and all other entries to 0, and finally set the diagonal entries of $A$ to zero. Indeed, we focus
on the existence of synaptic connections, and the occurrence of loops is low (3 loops in $G_c$ and
none in $G_g$), so we can ignore it.
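
The three preprocessing steps can be sketched in a few lines of NumPy. This is a minimal illustration on a made-up 3-neuron matrix, not the code used to produce our results; the function name `preprocess` is hypothetical.

```python
import numpy as np

def preprocess(A):
    """Return a binary, symmetric, hollow version of a weighted adjacency matrix."""
    A = np.asarray(A, dtype=float)
    A = A + A.T                  # symmetrize
    A = (A > 0).astype(int)      # binarize: positive entries become 1, all others 0
    np.fill_diagonal(A, 0)       # hollow: drop self-loops
    return A

# Hypothetical weighted, directed example with one self-loop at vertex 0.
W = np.array([[2, 3, 0],
              [0, 0, 0],
              [1, 0, 0]])
B = preprocess(W)
print(B)
```

Applying the same function to both connectomes yields the pair of binary, symmetric, hollow matrices assumed in the rest of the analysis.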

An image of the C. elegans worm body is seen in Figure 1. The pair of neural
connectomes is visualized in Figure 2. In the chemical connectome $G_c$, the interneurons are
heavily connected to the sensory neurons. The sensory neurons are connected more
frequently to the motor neurons and interneurons than amongst themselves. In the electrical
gap junction potential connectome $G_g$, the motor neurons are heavily connected to the
interneurons. The sensory neurons are connected more frequently to the motor neurons
and interneurons than among themselves. The connectome dataset is accessible at
http://openconnecto.me/herm-c-elegans. Figure 3 presents the adjacency matrices of the paired
C. elegans connectomes.

Figure 1: An image of the Caenorhabditis elegans (C. elegans) roundworm. The image is
available at http://post.queensu.ca/~chinsang/research/c-elegans.html.



Figure 2: The pair of C. elegans neural connectomes visualized as graphs. Red nodes
correspond to motor neurons, green nodes correspond to interneurons, and blue nodes correspond to
sensory neurons. (Left) The chemical connectome $G_c$. (Right) The electrical gap junctional
connectome $G_g$. Both synaptic connectomes are sparse, while $G_g$ is much sparser than $G_c$. A
similar connectivity pattern is seen across both connectomes.

Figure 3: (Left) The adjacency matrix $A_c$ of $G_c$ sorted according to the neuron types. (Right) The
adjacency matrix $A_g$ of $G_g$ sorted according to the neuron types. The red block corresponds to the
connectivity among the motor neurons, the green block corresponds to the connectivity among the
interneurons, and the blue block corresponds to the connectivity among the sensory neurons.

Figure 4: A depiction of graph matching. Given two graphs $G_1$ and $G_2$, graph matching seeks
an alignment (represented by the green lines) between the vertices across the two graphs.

2 Joint Graph Inference 

We consider an inference framework in the joint space of the C.elegans neural connectomes, 
which we refer to as joint graph inference. We focus on two aspects of joint graph inference: 
seeded graph matching and joint vertex classification. 

2.1 Seeded Graph Matching 

The problem of seeded graph matching (SGM) is a subproblem of the graph matching (GM)
problem, which has wide applications in object recognition [Berg et al., 2005, Caelli and
Kosinov, 2004], image analysis [Conte et al., 2003], computer vision [Cho and Lee, 2012, Zhou and
De la Torre, 2012], and neuroscience [Haris et al., 1999, Vogelstein et al., 2011, Zaslavskiy
et al., 2009]. Given two graphs $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$ with respective adjacency
matrices $A_1$ and $A_2$, and $|V_1| = |V_2| = n$, the GM problem seeks a bijection $\phi$ between the
vertex sets that minimizes edge disagreements [Lyzinski et al., 2014b, Fishkind et al., 2012].
The graph matching problem is NP-hard [Van Leeuwen and Leeuwen, 1990]. It is not known
whether any efficient graph matching algorithm exists, and it is suspected that none does. For a
comprehensive survey on the graph matching problem, see Conte et al. [2004] and Vogelstein
et al. [2011]. The intuitive idea of graph matching is illustrated in Figure 4.
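
To make the objective concrete, the following sketch counts the edge disagreements induced by a candidate bijection on two small undirected graphs. It only evaluates a given alignment and is not a graph matching algorithm; the function name `edge_disagreements` and the toy graphs are illustrative assumptions. For binary, symmetric, hollow matrices, this count equals half of the squared Frobenius norm of the difference between the first matrix and the relabeled second matrix.

```python
import numpy as np

def edge_disagreements(A1, A2, phi):
    """Count unordered pairs {i, j} whose adjacency differs between A1 and A2 under phi.

    phi[i] is the vertex of the second graph matched to vertex i of the first graph.
    """
    A1, A2, phi = np.asarray(A1), np.asarray(A2), np.asarray(phi)
    A2_aligned = A2[np.ix_(phi, phi)]         # relabel the second graph according to phi
    iu = np.triu_indices(A1.shape[0], k=1)    # visit each unordered pair once
    return int(np.sum(A1[iu] != A2_aligned[iu]))

# Toy example: the path 0-1-2 and the same path with its vertices relabeled (0-2-1).
A1 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
A2 = np.array([[0, 0, 1], [0, 0, 1], [1, 1, 0]])

print(edge_disagreements(A1, A2, [0, 1, 2]))  # identity alignment
print(edge_disagreements(A1, A2, [0, 2, 1]))  # alignment that recovers the relabeling
```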

The seeded graph matching (SGM) problem employs an additional constraint, where a partial
correspondence between the vertices is known a priori. Those vertices are called “seeds”.
Adding seeds requires only a slight change to the graph matching algorithm, and improves
the graph matching performance [Fishkind et al., 2012]. Let $S_1 \subset V_1$ and
$S_2 \subset V_2$ be two subsets of the vertex sets, and suppose $S_1 = S_2 = \{1, 2, \dots, m\}$.
The elements of $S_1$ and $S_2$ are called seeds, and the remaining $n - m$ vertices are non-seeds.
In the SGM problem, one seeks a bijection $\phi$, constrained on the seeds so that
$\phi_{S_1} = \phi_{S_2}$ (each seed is matched to its known counterpart), that minimizes the
number of induced edge disagreements.

The SGM problem, as a subproblem of GM, is NP-hard. We seek an approximate SGM
solution that is computationally efficient. The performance of SGM solutions is measured by
the matching accuracy $\delta(m)$, defined as the number of correctly matched non-seed vertices
divided by the total number of non-seeds $n - m$. When the number of seeds $m$ is given, the
remaining $n - m$ vertices need to be matched. Hence, the chance matching accuracy is
$\frac{1}{n-m}$, and this accuracy increases as $m$ increases. For larger values of $m$, more
information on the partial correspondence between the vertices is available, and thus the SGM
matching accuracy becomes higher. In this work, we apply the state-of-the-art SGM algorithm
developed by Fishkind et al. [2012], seek the correspondence between the two types of neuron
connectomes, and study the joint structure of the worm neural connectomes.
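
The accuracy measure and its chance level can be written down directly. In the sketch below, vertices are indexed from 0 and the first m indices are taken to be the seeds, which mirrors the convention that the seeds are the first m vertices; the function names and the example matching are hypothetical.

```python
def matching_accuracy(predicted, truth, m):
    """Fraction of non-seed vertices (indices m, ..., n-1) that are matched correctly."""
    n = len(truth)
    correct = sum(predicted[i] == truth[i] for i in range(m, n))
    return correct / (n - m)

def chance_accuracy(n, m):
    """Expected accuracy of a uniformly random matching of the n - m non-seeds.

    A uniformly random bijection on k items has one fixed point in expectation,
    so the expected fraction of correct matches is 1 / k.
    """
    return 1.0 / (n - m)

# Made-up matching on 5 vertices with 2 seeds: one of the three non-seeds is correct.
print(matching_accuracy([0, 1, 3, 2, 4], [0, 1, 2, 3, 4], m=2))

# Chance level for the 253 non-isolated neurons at a few seed counts.
for m in (0, 20, 100, 180):
    print(m, chance_accuracy(253, m))
```

Even at 180 seeds the chance level is only 1/73, so chance accuracy stays small over the whole range of seed counts.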


2.2 Joint Vertex Classification 


When we observe the adjacency matrix $A \in \{0,1\}^{n \times n}$ on $n$ vertices and the class
labels $\{Y_i\}_{i=1}^{n-1}$ associated with the first $(n-1)$ training vertices, the task of vertex
classification is to predict the label $Y$ of the test vertex $v$. In this case study, the class
labels are the neuron types: motor neurons, interneurons and sensory neurons. In this work, we
assume the correspondence between the vertex sets across the two graphs is known. Given two
graphs $G_1 = (V, E_1)$ and $G_2 = (V, E_2)$ where $V = \{v_1, \dots, v_{n-1}, v\}$, and given the
class labels $\{Y_i\}_{i=1}^{n-1}$ associated with the first $(n-1)$ training vertices, the task of
joint vertex classification is to predict the label of the test vertex $v$ using information
jointly from $G_1$ and $G_2$.

Fusion inference merges information from multiple disparate data sources in order to obtain
more accurate inference than using only a single source. Our joint vertex classification
consists of two main steps: first, an information fusion technique, namely the omnibus embedding
methodology of Priebe et al. [2013]; and second, the inferential task of vertex classification.
The omnibus embedding step proceeds as follows. Given $G_1$ and $G_2$, we construct an
omnibus matrix

$$M = \begin{pmatrix} A_1 & \Lambda \\ \Lambda & A_2 \end{pmatrix} \in \mathbb{R}^{2n \times 2n},$$

where $A_1$ and $A_2$ are the adjacency matrices of $G_1$ and $G_2$ respectively, and the
off-diagonal block is $\Lambda = \frac{1}{2}(A_1 + A_2)$. We consider the adjacency
spectral embedding [Sussman et al., 2012] of $M$ as $2n$ points into $\mathbb{R}^d$. Let

$$U = \begin{bmatrix} U_1 \\ U_2 \end{bmatrix} \in \mathbb{R}^{2n \times d}$$

denote the resulting joint embedding, where $U_1 \in \mathbb{R}^{n \times d}$ is the joint embedding
corresponding to $G_1$, and $U_2 \in \mathbb{R}^{n \times d}$ to $G_2$. Our inference task is vertex
classification. Let $T_{n-1} := U_1([n-1], :) \in \mathbb{R}^{(n-1) \times d}$ denote the training
set containing the first $n - 1$ vertices. We train a classifier on $T_{n-1}$, and classify the test
vertex $v$. A depiction of the joint vertex classification procedure is seen in Figure 5.
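
The embedding step can be sketched as follows on two small random graphs. The off-diagonal block is the average of the two adjacency matrices, as described above. The spectral embedding shown here keeps the d eigenvalues of largest magnitude and scales the eigenvectors by the square roots of their absolute values; this is one common convention for adjacency spectral embedding and is an assumption of the sketch, as are the function names and the toy data.

```python
import numpy as np

def omnibus_matrix(A1, A2):
    """Build the 2n-by-2n omnibus matrix with off-diagonal block (A1 + A2) / 2."""
    avg = (A1 + A2) / 2.0
    return np.block([[A1, avg], [avg, A2]])

def adjacency_spectral_embedding(M, d):
    """Embed the rows of a symmetric matrix M into R^d using its top-d eigenpairs."""
    vals, vecs = np.linalg.eigh(M)
    top = np.argsort(np.abs(vals))[::-1][:d]   # d eigenvalues of largest magnitude
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))

rng = np.random.default_rng(0)

def random_graph(n):
    """Random binary, symmetric, hollow adjacency matrix (toy data only)."""
    upper = np.triu(rng.integers(0, 2, size=(n, n)), k=1)
    return upper + upper.T

n, d = 6, 2
A1, A2 = random_graph(n), random_graph(n)
M = omnibus_matrix(A1, A2)
U = adjacency_spectral_embedding(M, d)
U1, U2 = U[:n], U[n:]      # rows for the first graph, rows for the second graph
T = U1[: n - 1]            # training rows: the first n - 1 vertices of the first graph
print(M.shape, U1.shape, T.shape)
```

A classifier is then trained on the rows of T with their labels and applied to the remaining row of U1, which represents the test vertex.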


Figure 5: A depiction of joint vertex classification (panels: Joint Embedding, Classification).
Joint vertex classification embeds the joint adjacency matrix, i.e., the omnibus matrix, and
classifies the test vertex $v$ on the embedded space.

We demonstrate that fusing the pair of neural connectomes generates more accurate
inference results than using a single connectome alone. We consider single vertex
classification for comparison, which embeds the adjacency matrix $A_1$ into $\mathbb{R}^d$ via
adjacency spectral embedding, and classifies on the embedded space. A depiction of the single
vertex classification procedure is seen in Figure 6.



Figure 6: A depiction of single vertex classification (panel: Classification). Single vertex
classification embeds one single adjacency matrix, and classifies on the embedded space.

3 Discoveries from the Joint Space of the Neural Connectomes

3.1 Finding the Correspondence between the Chemical and the 
Electrical Connectomes 

We apply seeded graph matching to the paired C. elegans neural connectomes, and discover the
underlying structure preserved across the chemical and the electrical connectomes [Fishkind
et al., 2012]. Figure 7 presents the errorbar plot of the seeded graph matching accuracy
$\delta(m)$, plotted in black, against the number of seeds $m \in \{0, 20, 40, \dots, 180\}$. For each
selected number of seeds $m$, we randomly and independently select 100 seeding sets $S_1$. For
each seeding set $S_1$ at a given number of seeds $m$, we apply the state-of-the-art seeded graph
matching algorithm [Fishkind et al., 2012]. The mean accuracy $\delta(m)$ is obtained by averaging
the accuracies over the 100 Monte Carlo replicates at each $m$. As $m$ increases, the matching
accuracy improves. This is expected, because more seeds give more information, making the SGM
problem less difficult. The chance accuracy at each $m$, plotted as a brown dashed line, is
$\frac{1}{n-m}$, which does not increase significantly as $m$ increases.

We note two significant neurological implications of our graph matching result. First,
SGM on the pair of connectomes indicates that the chemical and the electrical connectomes
have statistically significantly similar connectivity structure. The second significant
implication is as follows. If the performance of SGM on the chemical and the electrical
connections were perfect, then one could consider just one (either one) of the paired neural
connectomes without losing much information. If the performance of SGM on the chemical and the
electrical connections were no better than random vertex alignment, then this would suggest
that there is no structural similarity across the two connectomes, and further that analysis of
the connectomes should proceed separately and individually. In fact, the seeded graph matching
accuracy on the C. elegans neural connectomes is much higher than chance but falls short of a
perfect matching. This demonstrates that the optimal inference should be performed in the joint
space of the chemical and the electrical connectomes. This discovery is noted in [Lyzinski et
al., 2014a].

3.2 Predicting Neuron Types from the Joint Space of the Chemical 
and the Electrical Connectomes 

The result of SGM on the C. elegans neural connectomes demonstrates the advantage of
inference in the joint space of the neural connectomes, and provides a statistical motivation to
apply our proposed joint vertex classification approach. Furthermore, the neurological
motivation for applying joint vertex classification stems from illustrating a methodological
framework to understand the coexistence and significance of chemical and electrical synaptic
connectomes.

Figure 7: Seeded graph matching on the C. elegans neural connectomes $A_c$ and $A_g$ (x-axis:
number of seeds). For each selected number of seeds $m \in \{0, 20, 40, \dots, 180\}$, we randomly
select 100 independent seeding sets $S_1$ and apply SGM for each Monte Carlo replicate. The SGM
mean accuracy $\delta(m)$, plotted in black, is obtained by averaging the accuracies over the 100
Monte Carlo replicates. As the number of seeds $m$ increases, the accuracy increases. The chance
accuracy, plotted as a brown dashed line, is much lower than the SGM accuracy. This suggests
that a significant similarity exists between the two types of synapse connections. The SGM
accuracy on the C. elegans neural connectomes is much higher than chance but falls short of a
perfect matching, indicating the optimal inference must proceed in the joint space of both
neural connectomes.


We apply joint vertex classification and single vertex classification to the paired C. elegans
neural connectomes, and compare the classification performance. The validation is done via
leave-one-out cross validation. Here we do not investigate which embedding dimension $d$
or classifier is optimal for our classification task, and we choose a support vector machine
classifier with a radial basis kernel [Cortes and Vapnik, 1995] for the classification step. The
paired plots in Figure 8 present the misclassification errors against the embedding dimensions
$d \in \{2, 5, 8, \dots, 116, 119\}$. For the chemical connectome, the joint vertex classification
(plotted in black) outperforms the single vertex classification (plotted in magenta) at all the
considered embedding dimensions. For the electrical connectome, the joint vertex classification
(plotted in black) outperforms the single vertex classification (plotted in magenta) at most of
the considered embedding dimensions, especially the larger embedding dimensions.
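
The leave-one-out protocol can be sketched as below. To keep the example self-contained, a one-nearest-neighbor rule stands in for the support vector machine used in our experiments, and the embedding is held fixed across held-out vertices; both choices, as well as the function names and the toy points, are assumptions of this illustration and not a description of our implementation.

```python
import numpy as np

def nearest_neighbor_predict(X_train, y_train, x_test):
    """Stand-in classifier: return the label of the closest training point."""
    dists = np.linalg.norm(X_train - x_test, axis=1)
    return y_train[np.argmin(dists)]

def leave_one_out_error(X, y, predict=nearest_neighbor_predict):
    """Hold out each vertex in turn, train on the rest, and return the error rate."""
    n = len(y)
    errors = 0
    for i in range(n):
        keep = np.arange(n) != i                       # all vertices except vertex i
        errors += int(predict(X[keep], y[keep], X[i]) != y[i])
    return errors / n

# Hypothetical embedded points in R^2 forming two well-separated classes.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
y = np.array(["motor", "motor", "motor", "sensory", "sensory", "sensory"])
print(leave_one_out_error(X, y))
```

Repeating this computation for each embedding dimension, once on the joint embedding and once on the single-graph embedding, gives the pair of error curves that are compared.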

The superior performance of the joint vertex classification over the single vertex
classification has an important neuroscientific implication. In many animals, the chemical
synapses co-exist with the electrical synapses. Modern understanding of the coexistence of
chemical and electrical synaptic connectomes suggests that such a coexistence has physiological
significance. We discover that using both chemical and electrical connectomes jointly generates
better classification performance than using one connectome alone. This may serve as a first
step towards providing a methodological and quantitative approach to understanding the
significance of this coexistence.

Figure 8: Classification performance of joint and single vertex classification. (Left)
Classification on the chemical connectome $A_c$. For all embedding dimensions
$d \in \{2, 5, 8, \dots, 116, 119\}$, the error rate of joint vertex classification, plotted in
magenta, is lower than that of single vertex classification, plotted in black. (Right)
Classification on the electrical gap junctional connectome $A_g$. For most of the embedding
dimensions, especially the larger ones, the error rate of joint vertex classification, plotted
in magenta, is lower than that of single vertex classification, plotted in black. Our
classification result indicates that using information from the joint space of the neural
connectomes improves classification performance.



4 Summary and Discussion 

The paired Caenorhabditis elegans connectomes have become a fascinating dataset for
motivating a better understanding of nervous connectivity systems. We have presented the
unique statistical approach of joint graph inference, that is, inference in the joint graph
space, to study the worm's connectomes. Utilizing jointly the chemical and the electrical
connectomes, we discover statistically significant similarity preserved across the two synaptic
connectome structures. Our result of seeded graph matching indicates that the optimal inference
on the information-processing properties of the connectomes must proceed in the joint space of
the paired graphs.

The development of seeded graph matching provides a strong statistical motivation for
joint vertex classification, where we predict neuron types in the joint space of the paired
connectomes. Joint vertex classification outperforms single vertex classification at all
considered embedding dimensions on the chemical connectome, and at most of them on the
electrical connectome. Fusion inference
using both the chemical and the electrical connectomes produces more accurate results than
using one (either one) connectome alone, and enhances our understanding of the C. elegans
connectomes. The chemical and the electrical synapses are known to coexist in most organisms.
Our proposed joint vertex classification provides a methodological and quantitative
framework for understanding the significance of the coexistence of the chemical and the
electrical synapses. Further development of joint graph inference is a topic of ongoing
investigation in both neuroscience and statistics.